Floating Point Representation
The goal of this exercise is to better understand the IEEE 754
representation of floating-point numbers, the behavior of operations
among them, and floating-point exceptions.
- run paranoia: is your system IEEE 754 conformant?
- Using the "float-as-int union", print the exponent and mantissa of a
given number, find the "next and previous float", and verify the
representation of infinities, subnormals and NaN.
- suggest code to identify subnormals, NaN and infinities, either by
checking their binary representation or by using their "properties"
- Write simple floating-point operations and compare the results with
the value the equivalent expression would have in "real numbers"
(exact arithmetic).
- measure the ulp-gap (and its size as fp-number!) between
expectations and results
- Make them generate exceptions and verify which exceptions
have actually been raised.
- Trap the exceptions using SIGFPE
- improve the accuracy of operations using refactoring, Kahan
summation or another technique.
- compare the speed w.r.t. using double-precision
- Apply all this to the minimization example
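
As a starting point for the "float-as-int union" items, here is a minimal sketch, not the content of lookInFloat.cpp. All names (FloatBits, decompose, nextUp, ...) are hypothetical. It assumes float is IEEE 754 binary32 and relies on GCC's documented support for reading the inactive union member; std::memcpy is the strictly portable alternative.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// The "float-as-int union": same 32 bits seen as float or as integer.
// Assumption: GCC-style union type punning (an extension in C++).
union FloatBits {
    float f;
    std::uint32_t i;
};

struct Fields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t mantissa;  // 23 bits, without the implicit leading 1
};

inline Fields decompose(float x) {
    FloatBits b;
    b.f = x;
    return Fields{ b.i >> 31, (b.i >> 23) & 0xFFu, b.i & 0x7FFFFFu };
}

// Next representable float above a positive finite x:
// for positive values the integer pattern is ordered like the float.
inline float nextUp(float x) {
    assert(x > 0.0f && x < std::numeric_limits<float>::infinity());
    FloatBits b;
    b.f = x;
    ++b.i;
    return b.f;
}

// Previous representable float below a positive x.
inline float nextDown(float x) {
    assert(x > 0.0f);
    FloatBits b;
    b.f = x;
    --b.i;
    return b.f;
}

// Classification from the binary representation.
inline bool isInfBits(float x) {
    Fields p = decompose(x);
    return p.exponent == 0xFFu && p.mantissa == 0u;
}
inline bool isNaNBits(float x) {
    Fields p = decompose(x);
    return p.exponent == 0xFFu && p.mantissa != 0u;
}
inline bool isSubnormalBits(float x) {
    Fields p = decompose(x);
    return p.exponent == 0u && p.mantissa != 0u;
}

// Classification from "properties" (may be optimized away under -ffast-math).
inline bool isNaNProperty(float x) { return x != x; }
inline bool isInfProperty(float x) {
    return std::fabs(x) > std::numeric_limits<float>::max();
}
inline bool isSubnormalProperty(float x) {
    return x != 0.0f && std::fabs(x) < std::numeric_limits<float>::min();
}
```

For example, decompose(1.0f) gives sign 0, biased exponent 127 and mantissa 0. std::nextafter from <cmath> covers the cases (zero, negative values) that nextUp and nextDown deliberately leave out.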
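
For the ulp-gap item, one possible approach (a sketch with hypothetical helper names, again assuming binary32) is to map each float to an integer that is ordered like the float itself, so that the gap in ulps is an integer difference. This version uses std::memcpy instead of the union.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Map a float to an integer that grows monotonically with the float value.
inline std::int64_t orderedKey(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    // IEEE 754 is sign-magnitude: mirror negative values onto negative integers.
    if (u & 0x80000000u)
        return -static_cast<std::int64_t>(u & 0x7FFFFFFFu);
    return static_cast<std::int64_t>(u);
}

// Number of representable floats between a and b (0 if they are equal).
inline std::int64_t ulpDistance(float a, float b) {
    assert(!std::isnan(a) && !std::isnan(b));
    std::int64_t d = orderedKey(a) - orderedKey(b);
    return d < 0 ? -d : d;
}

// Size of one ulp at x, expressed as a float:
// the gap between |x| and the next representable value above it.
inline float ulpSize(float x) {
    assert(std::isfinite(x));
    float ax = std::fabs(x);
    return std::nextafter(ax, std::numeric_limits<float>::infinity()) - ax;
}
```

With these, ulpDistance(result, expected) gives the gap in ulps and ulpDistance(result, expected) * ulpSize(expected) gives an estimate of its size as a floating-point number. Note that +0 and -0 are at distance 0.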
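
To check which exceptions an operation has raised, the flags can be cleared and then tested around it with the <cfenv> functions. The sketch below is not the content of fpe.cpp; the helper raisedBy and the sample operations are hypothetical names. Operands are volatile so that the compiler cannot evaluate the operation at compile time.

```cpp
#include <cfenv>
#include <limits>

// Clear all flags, run op, and return the flags it raised.
template <typename Op>
int raisedBy(Op op) {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float sink = op();  // volatile store keeps the operation alive
    return std::fetestexcept(FE_ALL_EXCEPT);
}

// Sample operations; volatile operands force evaluation at run time.
inline float divideByZero() {
    volatile float one = 1.0f, zero = 0.0f;
    return one / zero;   // +inf, raises FE_DIVBYZERO
}
inline float zeroOverZero() {
    volatile float zero = 0.0f;
    return zero / zero;  // NaN, raises FE_INVALID
}
inline float overflow() {
    volatile float big = std::numeric_limits<float>::max();
    return big * 2.0f;   // +inf, raises FE_OVERFLOW (and FE_INEXACT)
}
inline float oneThird() {
    volatile float one = 1.0f, three = 3.0f;
    return one / three;  // not representable exactly, raises FE_INEXACT
}
```

A check then reads, for example, `if (raisedBy(overflow) & FE_OVERFLOW) ...`. These flags are sticky status bits only; turning them into SIGFPE requires enabling traps, which on glibc is done with the GNU extension feenableexcept (not used in this sketch).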
Code
in exercises:
lookInFloat.cpp fpe.cpp
kernels.cc loglike.cpp
in examples:
paranoia.c
Hints
export CXX=g++-451
export CXXFLAGS="-O2 -std=gnu++0x -msse4 -ftree-vectorize -ftree-vectorizer-verbose=1 -pthread -fPIC -fopenmp"
export LD_LIBRARY_PATH=${PWD}/lib:${LD_LIBRARY_PATH}
// use extern to avoid inlining (compile with -fPIC)
// "volatile" can be used to avoid eager optimization
objdump -S -r -C --no-show-raw-insn -w test.o | less
(on MacOS: otool -t -v -V -X test.o | c++filt | less)
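
The "volatile" hint can be illustrated with a small sketch (function names are hypothetical). At -O2 the compiler is free to evaluate the first function at compile time, so no multiplication appears in the disassembly; the volatile operand in the second one forces a load and a run-time multiplication that can be found in the objdump output.

```cpp
// The compiler may fold this to a constant: no multiply instruction emitted.
float scaleFolded() {
    float x = 0.1f;
    return x * 3.0f;
}

// volatile forces x to be read at run time, so the multiply is emitted.
float scaleAtRuntime() {
    volatile float x = 0.1f;
    return x * 3.0f;
}
```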
References
IEEE 754 on Wikipedia
Single-precision floating-point format
man fetestexcept
std::numeric_limits